Die Modellierung von Turn-taking in einem korpusbasierten Chatbot / Modelling turn-taking in a corpus-trained chatbot
Abstract
A chatbot system is a conversational agent that interacts with a user turn by turn in natural language. To simulate human chat, we developed a Java program that converts a machine-readable text (corpus) to the ALICE-AIML chatbot linguistic knowledge representation format. An important problem in corpus-trained chatbot development is how to model dialogue turn-taking. In the ALICE-AIML representation, the corpus is mapped onto a list of rules, or Categories, each pairing a Pattern (user input) with an appropriate Template (system response). We present different turn-taking models, depending on the type of corpus: spoken dialogue transcripts, such as the Corpus of Spoken Afrikaans and the spoken part of the British National Corpus; partly structured monologue, such as the verses of the Qur'an; and texts showing clear turn-taking, such as Loebner Prize contest transcripts and Frequently Asked Questions (FAQ) websites. Our turn-taking model is simple enough to be generic and learnable from this wide range of training corpora, yet adequate to give reasonable user satisfaction in user trials.
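The Pattern/Template pairing described in the abstract can be illustrated with a minimal Java sketch. It is not the authors' converter: it assumes the simplest possible turn-taking model, in which each turn becomes a Pattern and the next turn by a different speaker becomes its Template. The names `TurnTakingSketch`, `Turn`, `toPattern` and `toCategories`, and the normalisation rule (upper-case, punctuation removed), are hypothetical choices made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Illustrative only: pairs adjacent turns of a transcript into AIML-style categories. */
final class TurnTakingSketch {

    /** One utterance in a transcript (hypothetical type, not taken from the paper). */
    record Turn(String speaker, String text) {}

    /** Assumed normalisation: upper-case, punctuation replaced by spaces, whitespace collapsed. */
    static String toPattern(String utterance) {
        return utterance.toUpperCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{N}\\s]", " ")
                .trim()
                .replaceAll("\\s+", " ");
    }

    /** Escapes the characters that would break the XML of a template. */
    static String escapeXml(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /**
     * Assumed turn-taking model: turn i is the Pattern, turn i+1 is the Template,
     * provided the two turns come from different speakers.
     */
    static List<String> toCategories(List<Turn> turns) {
        List<String> categories = new ArrayList<>();
        for (int i = 0; i + 1 < turns.size(); i++) {
            Turn input = turns.get(i);
            Turn reply = turns.get(i + 1);
            if (input.speaker().equals(reply.speaker())) {
                continue; // same speaker twice in a row: no turn change to learn from
            }
            String pattern = toPattern(input.text());
            if (pattern.isEmpty()) {
                continue; // nothing left to match on after normalisation
            }
            categories.add("<category><pattern>" + pattern + "</pattern><template>"
                    + escapeXml(reply.text()) + "</template></category>");
        }
        return categories;
    }

    public static void main(String[] args) {
        List<Turn> transcript = List.of(
                new Turn("A", "Hello, how are you?"),
                new Turn("B", "Fine, thanks. And you?"),
                new Turn("A", "Not bad."));
        toCategories(transcript).forEach(System.out::println);
    }
}
```

A corpus without explicit speaker labels, such as the partly structured monologue mentioned above, would need a different pairing rule; this sketch covers only the clear turn-taking case.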
Similar resources
Snowbot: An empirical study of building chatbot using seq2seq model with different machine learning framework
Chatbots are a growing topic. We built an open-domain generative chatbot using a seq2seq model with different machine learning frameworks (TensorFlow, MXNet). Our results show that although seq2seq is a successful method in neural machine translation, using it alone for a single-turn chatbot yields rather unsatisfactory results. Also, existing free dialogue corpora lack both quality and quantity. Our conclusion it’...
Using the Corpus of Spoken Afrikaans to generate an Afrikaans chatbot
This paper presents two chatbot systems, ALICE and Elizabeth, illustrating the dialogue knowledge representation and pattern matching techniques of each. We discuss the problems which arise when using the Corpus of Spoken Afrikaans (Korpus Gesproke Afrikaans) to retrain the ALICE chatbot system with human dialogue examples. A Java program to convert from dialog transcripts to the AIML linguisti...
Chatbots: Can They Serve as Natural Language Interfaces to QA Corpus?
A chatbot is a program which can chat in natural language, on a topic built into the chatbot’s internal knowledge model. Many chatbots exist, with different knowledge bases programmed by the chatbot builders. We have built a system to convert a website text (corpus) to a chatbot knowledge-base format. In this paper the chatbot is used as a question-answer interface, where the TREC09 QA track is used...
A Chatbot as a Novel Corpus Visualization Tool
The classical way of viewing a data set is the visualization process, which maps the data from numerical or textual form to a visual representation that our mind can easily interpret, such as graphical diagrams, charts, and geometric representations. In this paper we introduce a new idea: visualizing a dialogue corpus using a chatbot interface tool. We developed a Java program to conve...
Using dialogue corpora to train a chatbot
This paper presents two chatbot systems, ALICE and Elizabeth, illustrating the dialogue knowledge representation and pattern matching techniques of each. We discuss the problems which arise when using the Dialogue Diversity Corpus to retrain a chatbot system with human dialogue examples. A Java program to convert from dialog transcript to AIML format provides a basic implementation of corpusbas...
Publication date: 2005